Add Query Resource Based Eviction#7488
Conversation
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
| } else { | ||
| // TODO: Consider wrapping logger to differentiate from querier module logger | ||
| queryable, _, queryEngine = querier.New(t.Cfg.Querier, t.OverridesConfig, t.Distributor, t.StoreQueryables, rulerRegisterer, util_log.Logger, t.OverridesConfig.RulesPartialData, nil) | ||
| queryable, _, queryEngine, _ = querier.New(t.Cfg.Querier, t.OverridesConfig, t.Distributor, t.StoreQueryables, rulerRegisterer, util_log.Logger, t.OverridesConfig.RulesPartialData, nil) |
There was a problem hiding this comment.
No support for rulers. I see. it can be added in a follow up PR.
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <60357054+eeldaly@users.noreply.github.com>
SungJin1212
left a comment
There was a problem hiding this comment.
Thanks for the PR! I left a few comments.
| func NewQueryEvictor( | ||
| monitor resource.IMonitor, | ||
| registry *QueryRegistry, | ||
| cfg configs.EvictionConfig, | ||
| logger log.Logger, | ||
| reg prometheus.Registerer, | ||
| component string, | ||
| ) (*QueryEvictor, error) { | ||
| if !cfg.Enabled() { | ||
| return nil, nil | ||
| } | ||
|
|
||
| e := &QueryEvictor{ | ||
| monitor: monitor, | ||
| registry: registry, | ||
| cfg: cfg, | ||
| logger: logger, | ||
| evictionsTotal: promauto.With(reg).NewCounterVec(prometheus.CounterOpts{ | ||
| Name: "cortex_query_evictions_total", | ||
| Help: "Total number of queries evicted due to resource pressure.", | ||
| ConstLabels: map[string]string{"component": component}, | ||
| }, []string{"resource"}), | ||
| } | ||
|
|
||
| e.Service = services.NewBasicService(nil, e.running, nil) | ||
| return e, nil | ||
| } |
There was a problem hiding this comment.
nit: NewQueryEvictor() currently appears to return nil error in all code paths.
| if len(victims) == 0 { | ||
| continue // no running queries to evict | ||
| } | ||
|
|
There was a problem hiding this comment.
Would it help to emit a warning or debug log for threshold breached but no evictable query found?
That could make tuning min_query_age much easier in production.
Signed-off-by: Essam Eldaly <eeldaly@amazon.com>
Signed-off-by: Essam Eldaly <60357054+eeldaly@users.noreply.github.com>
justinjung04
left a comment
There was a problem hiding this comment.
Thanks! Few small comments
| # eviction. Supported values: fetched_samples, fetched_series, | ||
| # fetched_chunks, fetched_chunk_bytes. | ||
| # CLI flag: -querier.query-protection.eviction.eviction-metric | ||
| [eviction_metric: <string> | default = "fetched_samples"] |
There was a problem hiding this comment.
any particular reason you chose fetched_samples as default? is there some data you can share about each metric's correlation to the query heaviness?
| type ErrQueryEvicted struct{} | ||
|
|
||
| func (e *ErrQueryEvicted) Error() string { | ||
| return status.Error(codes.ResourceExhausted, "resource limit reached").Error() |
There was a problem hiding this comment.
Could you verify if this triggers retry in query frontend? I belive we do not want evicted queries to be retried?
What this PR does:
This PR builds on the current resource based throttling infrastructure (#6674) to allow for evicting currently running queries. This is currently only implemented on querier pods but can be extended to other pods similar to resourced based throttling.
Flags
All flags are prefixed with
-querier.query-protection.eviction.threshold.cpu-utilization: Max CPU utilization (0–1) before evicting the heaviest query (default 0)threshold.heap-utilization: Max heap utilization (0–1) before evicting the heaviest query (default 0)check-interval: How frequently the evictor checks resource utilization (default 1s)cooldown-period: Number of check intervals to wait after an eviction before evicting again (default 3)eviction-metric: Metric used to determine the heaviest query (fetched_samples,fetched_series,fetched_chunks,fetched_chunk_bytes) (defaultfetched_samples)min-query-age: Minimum time a query must be running before it becomes eligible for eviction (default 10s)The evictor will be disabled and will not check every
check-intervalif both cpu and heap utilization are disabled (set to 0).How it works
This feature is completely disabled and the registry will not be created if
cpu-utilizationandheap-utilizationare set to 0. If either of them is larger:check-intervalmin-query-agewill be evaluated from heaviest based oneviction-metric. The heaviest query will be evictedcheck-intervalbefore checking again if threshold is breached.Why the current metrics?
The current metrics are not the best to detect the root cause for high heap, however, they are readily available and can be used as a proxy until work in Prometheus/Thanos is done to allow for better metrics.
peak_samplesis currently only available inquery_statsafter the query is completed and we have no way of tracking current heap usage by a query. Any new metrics can easily be added to this structure later.Metrics
cortex_query_evictions_total{resource="cpu|heap", component="querier"}: Counter increments by one for every eviction that occurs. A single query may lead to multiple increments if it retries and ends up evicted again.note: make doc added the config to
store-gateway.mdfile even though this is not currently implemented on there as it is built on top of resource based throttling.Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]docs/configuration/v1-guarantees.mdupdated if this PR introduces experimental flags